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Abstract 


Assessment and monitoring of forest structure is frequently done with absolute density 
measures with field forest inventory data and expansion methods. The development of 
basal area and the number of trees estimation functions with data derived from very 
high spatial resolution satellite images enable their short-term and cost-effective evalu- 
ation, allowing also the estimation for the area not requiring extrapolation methods. 
The functions of basal area and the number of trees per hectare are based on crown 
cover obtained with very high spatial resolution satellite images for two evergreen oaks 
and umbrella pine. The three tree species are especially important in the agroforestry 
systems of the Mediterranean region. The linear functions fitted for pure stands of the 
three species and mixed stands of cork and holm oak and of cork oak and umbrella pine 
showed a better performance for basal area than for the number of trees per hectare. 
The inclusion of dummy variables for species composition improved the accuracy of 
the functions. 


Keywords: quickBird image, multi-resolution segmentation, crown horizontal 
projection, absolute density measures, regression 


1. Introduction 


Holm oak (Quercus rotundifolia), cork oak (Quercus suber), and umbrella pine (Pinus pinea) are 
three of the most representative species in agroforestry systems in the Mediterranean basin 
[1, 2]. Their stands have usually low density, open heterogeneous canopies and as main produc- 
tion bark for cork oak and fruit for all [3-5]. They also have other productions, such as agricul- 
tural crops, pasture and extensive grazing. Due to the climate variability of the Mediterranean 
region, these multiple use systems enable risk dispersion and the sustainability in time [1, 2, 6]. 
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Absolute density measures, especially the number of trees and basal area, are referred in silvi- 
culture textbooks as a proxy for competition and growing space at tree level and thus for grow- 
ing conditions at stand level [7-15]. These measures are reported as a single value per stand, as 
the sum of the tree values, reported for a standard area, usually the hectare [7, 13, 16-19], and 
have the advantage of enabling comparison between different stands [7-15]. 


The conventional approach to calculate the number of trees and basal area is using field plots 
of forest inventories, which have a statistical sampling design and an expansion factor when 
working on an area basis [20-25]. Though a high accuracy is attained at field inventory plots, 
they have disadvantages related to the area sampled and the extrapolation methods needed 
for the evaluation on an area basis. 


Remote sensing-derived data have been used to estimate both the number of trees [26-33] and 
basal area [26, 29, 30, 32, 34-40]. The advantages are the estimation for small and large areas 
[25, 41-43]; enabling time series analysis in short time cycles [25, 42]; data collection at differ- 
ent scales, depending on the imagery spatial resolution [42, 43]; and enabling the evaluation 
of the area not requiring extrapolation methods [44-49]. The disadvantages are related to the 
imagery spatial resolution; accuracy decreases with the increase of the pixel size [50—52]. 


2. Materials and methods 


2.1. Satellite images and inventory data 


Two satellite images were acquired as four-band products (Blue, Green, Red and Near- 
Infrared), with the bands pan-sharpened with the UNB algorithm [53]. The image from 
QuickBird satellite (hereafter Mora) was collected in August 2006, has an area of 80 km? (cen- 
tral coordinate 8°4'53.98”W and 38°51'16.12”N) and a spatial resolution of 0.7 m. The image 
from WorldView-2 satellite (hereafter Alcácer) was collected in June 2011, has an area of 
35 km? (central coordinate 8?4028.20"W and 38?27'45.71"N) and has a spatial resolution of 
0.5 m. The forest area is composed of holm oak and cork oak in the former and cork oak and 
umbrella pine in the latter. 


Theimages were orthorectified, georeferenced and atmospherically corrected. Ortho-rectification 
was carried out with the rational polynomial coefficient model (RCP) using the image metadata 
and ASTER GDEM digital elevation model (resolution 30 m) in Envi 4.8 [54]. For georeferencing 
the images, field data points obtained with global navigation satellite system (GNSS) were used, 
which resulted in a root mean square error (RMSE) of 0.49 m for Mora and 0.30 m for Alcácer. 
Atmospheric correction was done using the image digital number for top of atmosphere (ToA) 
reflectance to convert it to soil level reflectance using dark-object subtraction method [55]. 


From each image a vegetation mask per specie by a set of sequential steps was derived, using 
object-oriented analysis [56, 57]. Contrast split segmentation based on the normalized differ- 
ence vegetation index image (NDVI, [58] in eCognition, version 8.0.1 [59]) was used to isolate 
tree crowns, resulting in a vegetation mask. In the second step multi-resolution segmentation 
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[60] enabled the separation of the vegetation mask in different objects, using an algorithm 
with scale, shape and compactness as splitting parameters. The third step used the nearest 
neighbor algorithm to classify the objects per tree specie, in Weka 3.8.0 software [61], based 
on the descriptive statistics of the bands and on vegetation indices (NDVI, EVI, SAVI, and SR). 


2.2. Inventory data 


Each image was divided into a square grid, a function of their resolution, 65 x 65 pixels 
(45.5 x 45.5 m, 2070.25 m?) for Mora and 90 x 90 pixels (45 x 45 m, 2025 m?) for Alcácer. The 
grid area is similar to other studies with very high spatial resolution satellite images (e.g., 
[62]). Composition and crown cover (defined as the percent ratio between crown horizontal 
projection and grid area) were derived from the vegetation mask per specie using ArcGIS 
version 10 [63]. The grids were classified as monospecies if only one specie was present and 
multispecies otherwise. To design forest inventory, crown cover was grouped in three strata 
(10-30, 30-50 and >50%) and a random stratified sampling by proportional allocation was 
applied. 


The field surveys took place in 2011 in Mora and 2013-2014 in Alcácer, measuring 17, 24 and 
32 monospecies plots of holm oak (OR), cork oak (OS) and umbrella pine (PP), respectively, 
and 23 and 49 multispecies plots of holm oak and cork oak (OROS) and cork oak and umbrella 
pine (OSPP), respectively. The total number of plots is 145. The main stand was defined as all 
individuals with diameter at breast height equal or larger than 6 cm for holm and cork oaks 
and 5 cm for umbrella pine. In each plot, for the main stand, diameter at breast height, total 
height and four crown radii (directions in north, south, east and west) were measured [16], as 
well as recorded tree location by GNSS. The number of trees per hectare (N) was defined as 
the sum of all individuals of the main stand per plot, scaled to a reference area, the hectare. 
Basal area per hectare (G) was defined as the sum of the sectional area of all the trees of the 
main stand in the plot, scaled to a reference area, the hectare [7, 16]. 


2.3. Statistical analysis 


In the development of the functions for the number of trees and basal area per hectare, four 
types of functions were considered (Figure 1): (i) per composition classes (OR, OS, PP, OROS 
and OSPD); (ii) not considering composition; (iii) considering crown cover per specie; and (iv) 
considering composition as dummy variables. Dummy variables were defined as a dicho- 
tomic variable where 1 refers to a specific composition (OR, OS, PP, OROS or OSPP) and 0 to 
the other compositions. For (i) and (ii) the functions were fitted using ordinary least square 
linear regression through the origin (Eq. (1), where D is N or G, CC the crown cover and a the 
regression coefficient) and for (iii) and (iv) multiple linear regression with stepwise method 
and AIC as selection criteria (Eq. (2), where a and f are regression coefficients, d the dummy 
variables, i= OR, OS and PP and j = OR, OS, PP, OROS and QSPP). For multiple regression, 
multicolinearity among explanatory variables was analyzed with the variance inflation fac- 
tor (VIF), where no serious multicoliniarity is expected for VIF x 10 [64, 65]. As suggested 
by several authors (e.g., [66, 67]) the sum of squares of the residuals (SOR), the coefficient of 
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Satellite Images Forest inventory 


Figure 1. Model flow diagram (where CHP the crown horizontal projection in m?, CC the crown cover, QR holm oak, QS 
cork oak, PP umbrella pine, and p the plot). 


determination (R’) and the adjusted coefficient of determination (R?) were used to evaluate 
the statistical properties of the models. 


Whenever an independent data set is not available for model validation, re-sampling or jack- 
knifing methods are recommended. One of these methods uses PRESS residuals, calculated as 
the difference between the observed and the estimated value of a variable (in this study N or 
G), as a cross-validation statistic [68-70], which is considered also to improve reliability in the 
accuracy assessment [43]. PRESS residuals are attained by fitting the model iteratively k times 
with n-1 observations (where n is the total number of observations) guaranteeing their inde- 
pendence [68-70]. The validation of the models was tested for bias and precision— the former 
with the mean of the PRESS residuals (Eq. (3), where PRESSrm is the mean of the PRESS 
residuals, y the observed value, y the estimated value and i the number of the observation) 
and the latter with the mean of the absolute PRESS residuals values (PRESSram, Eq. (4)). To 
the former was added the analysis of the 5th and 95th percentile of the PRESS residuals [70]. 
Heteroscedasticity associated with the error term was evaluated graphically with the plots 
of the studentized residuals and the estimated values and the normality of the studentized 
residuals with normal quantile-quantile plots and Shapiro Wilk normality test, for a prob- 
ability level of 0.001 [68, 71]. To enable comparison with other studies (e.g., [36]), the relative 
root mean square error (RMSEr, Eq. (5)) was also computed. The statistical analysis was done 
using R statistical software [72]. 


D=axCC (1) 
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D = a,xCC,+B.xd, (2) 
MV, =, 1,1) 

PRESS. = 7 (3) 
Lily, ip Mal 

PRESS, = ——j— (4) 


(5) 


3. Results and discussion 


3.1. Satellite image analysis 


The vegetation mask per species accuracy resulting from the multi-resolution and object- 
oriented classification, illustrated in Figure 2, depends on several factors, namely the forest 
tree density, the limits between forest trees and other land uses, image date, forest tree spe- 
cies reflectance and crown shape. Stands with low density and isolated forest trees, such as 
holm oak, cork oak and umbrella pine in agroforestry systems, enable the distinction and 
separation of the tree crowns with accuracy in very high spatial resolution satellite images 
(e.g., [62]), minimizing the confounding effects of shadow pixels which occur in tree crowns 
at closer spacings [73]. Conversely, the diffused boundaries between forest tree crowns and 
other land uses can decrease classification accuracy [74]. Thus image dates with the highest 
contrast between different land uses, especially between herbaceous vegetation and forest 
trees, such as the summer (June-August) in the Mediterranean region, produce the smallest 
confounding influence [75, 76]. The other two variables that influence accuracy of the vegeta- 
tion mask per species are the reflectance similarity between species and crown shapes, the 
more similar the lower the accuracy. The forest areas of Mora and Alcácer are composed pre- 
dominantly of isolated trees, though some clusters occur, and the images were acquired in the 
dry summer period (June and August), thus with high contrast between the forest trees and 
the other land uses, especially the forest understory that was composed of soil and dry herba- 
ceous vegetation. Regarding reflectance and crown shape, holm oak and cork oak have larger 
similarities with wide ranges of reflectance and an irregular crown shape when compared to 
umbrella pine that has a narrower reflectance range and nearly circular crowns. Nonetheless, 
the spectral signatures of the tree species are different enough to enable a good classification 
[44, 46-49]. The accuracy evaluated with Kappa statistic [77, 78] encountered true positives 
for 90% of the holm oak and 87% of the cork oak for Mora and 98% of the umbrella pine and 
74% of the cork oak for Alcácer. 
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C] Quercus rotundifolia C] Quercus suber [| Pinus pinea 


Figure 2. Illustration of vegetation mask for two small areas (left) Mora (cork and holm oak) and (right) Alcacer (cork 
oak and umbrella pine). 


3.2. Number of trees and basal area functions 


The variability for N is larger than for G and for QS, QROS and QSPP than for QR and PP, 
denoted by the larger coefficient of variation (Table 1 and Figure 3). The number of trees and 
their spatial distribution in the stand determine their growing space and the tree crown area 
is also a function of the species epinastic control [12] and spacing, with crown closure origi- 
nating crown shyness [14], thus unbalancing the relation between crown cover and basal area 
and crown cover and number of trees. Holm oak, cork oak and umbrella pine have, especially 
in adult trees, weak epinastic control and are shade intolerant [5, 79], thus justifying the wide 
variability observed in the plots. However, positive correlations are observed for CC and G 
for all plots (0.72), stronger for QR (0.86) and QSPP (0.85) and weaker for QS and PP (0.66 
each) and OROS (0.62). These results are expected due to the almost direct relation between 
crown and stem diameters [80, 81]. Conversely, the correlation between CC and N is weaker 
when compared to the former. Positive correlations are found for all plots (0.47), stronger for 
OS (0.72), OR (0.57) and OSPP (0.54) and weaker for OROS (0.36) and no correlation for PP 
(70.04). The number of trees per hectare is frequently associated with the individual stems 
and the stage of development and the stand structure [7-15], giving rise to a higher variability 
in the number of trees (Figure 4). 


In general, for the functions fitted for all composition classes (Table 2), there is an improve- 
ment in the statistical properties of the models from those not considering composition (N3 
and G3) to those that included it as crown cover per specie (N2 and G2), with the best prop- 
erties attained by those with composition as dummy variables (N1 and G1). For the number 
of trees, a reduction of —27.4% for SOR and an increase of 3.2% for R^ from N3 to N1, a 
reduction of -18.096 for SOR and an increase of 2.1% for R^ from N2 to N1 and a reduction 
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Min Max Mean SD CV Min Max Mean SD CV 
(%) (%) 
All OR 
N 5 140 67 28.0 424 29 116 72 27 37.1 
G (m? ha?) 2.6 18.2 8.9 3.3 37.5 4.0 11.1 6.7 2.1 32.0 
CC (%) 9.7 72.6 37.1 15.1 40.6 13.7 67.6 37.6 15.9 42.3 
PHC (m?) 195.8 1470.0 756.1 306.5 40.5 284.2 1399.0 777.7 329.2 42.3 
QS PP 
N 5 135 60 33 54.2 5 119 72 23 32.4 
G (m? ha") 52 15.4 9.1 2.8 30.7 3.8 15.9 9.5 2.6 27:5 
CC (96) 13.4 70.5 35.3 14.0 39.7 13.0 66.7 40.5 17.1 42.3 
PHC (m?) 271.3 1460.2 723.9 293.0 40.5 263.5 1350.3 819.1 346.7 42.3 
QRQS QSPP 
N 19 140 62 29 47.7 15 123 69 30 43.1 
G (m? ha?) 2.6 14.7 6.7 2.7 40.3 2.9 18.2 10.3 3-7 36.0 
CC (%) 14.3 58.8 29.4 11.4 38.6 9:7 72.6 39.8 14.5 36.5 
PHC (m?) 296.9 1216.7 607.8 233.6 38.4 195.8 1470.0 804.9 293.9 36.5 


Table 1. Descriptive statistics per composition classes (where N is the number of trees per hectare, G the basal area per 
hectare, CC the crown cover calculated with satellite data, PHC crown horizontal projection, max the maximum, min the 
minimum, SD the standard variation and CV the coefficient of variation). 


of -8.096 for SOR and an increase of 1.0% for RG from N3 to N2 was observed. For basal area 
a reduction of -60.0% for SOR and an increase of 2.7% for R^ from G3 to G1, a reduction 
of —53.596 for SOR and an increase of 2.5% for R? from G2 to G1 and a reduction of -4.3% 
for SQR and an increase of 0.2% for R^ from G2 to G3 was observed. This is reflected in the 
better function performance of N1 or G1 when compared with N2 and N3 or G2 and C3. 
Bias is the lowest for N1 and G1 while precision is similar for all functions. The better sta- 
tistical properties of N1 and G1 when compared with N3 and G3 could be, at least partially, 
explained by the inclusion of the composition variable in the functions as N and G have dif- 
ferent behavior per composition classes (cf. Table 2). Comparing N1 and N2 or G1 and G2, 
the better performance of N1 and G1 can be explained by the relations between crown cover 
and the number of trees or basal area, for the different composition classes and probably by 
the influence of the clusters of mixed trees. When comparing N1 or G1, with the functions 
per composition class (Ni or Gi, i = 4,5,6,7,8), better model statistical properties are attained 
for N4 and N5 and G8 and G4, which correspond to the composition classes with smaller 
variability between CC and N and G (cf. Table 2), respectively. Nonetheless, in general, bias 
is larger for the number of trees than for the basal area. Similar results were attained for 
above-ground biomass functions with crown horizontal projection as an explanatory vari- 
able [44—46, 49]. The regression coefficients are presented in Eqs. (6)-(9) and Table 3. 
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Figure 4. Crown cover and number of trees and basal area per composition class. 
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Model Independent SOR R? R?. PRESSrm PRESSram Percentile RMSEr 


aj 
iabl 
variables 5th 95th 
N 
N1 CC, dOR, 89,148 0.884 0.879 0.002 0.828 -1.362 1.920 37.0 
dOS, dPP, 
dOROS, 
dOSPP 
N2 CCOR, 105,215 0.863 0.860 0.145 0.794 -1.178 1.966 40.1 
CCOS, CCPP 
N3 CC 113,595 0.852 0.851 0.172 0.802 —-1.105 2.062 41.7 
N4 CCOR 9870 0.900 0.893 0.196 0.912 —].445 1.404 33.7 
N5 CCOS 11,176 0.899 0.895 0.014 0.862 -1.146 1.873 35.9 
N6 CCPP 37,293 0.795 0.788 0.300 0.776 —0.667 2.113 47.6 
N7 CCOROS 19,842 0.836 0.830 0.148 0.748 -1.303 1.873 44.6 
N8 CCOSPP 31,382 0.878 0.875 0.111 0.833 -1.227 1.969 38.0 
G 
G1 CC, dOR, 585 0.955 0.953 0.001 0.794 -1.528 1.763 22.7 
dOS, dPP, 
dOROS, 
dOSPP 
G2 CCOR, 898 0.931 0.929 0.171 0.806 -1.690 1.729 28.1 
CCOS, CCPP 
G3 CC 936 0.928 0.927 0.162 0.810 -1.727 1.661 28.7 
G4 CCOR 39 0.953 0.950 0.237 0.899 -1.301 1.602 22.8 
G5 CCQS 142 0.935 0.932 0.202 0.883 -1.882 1.356 26.6 
G6 CCPP 267 0.913 0.910 0.277 0.873 -1.304 1.627 30.5 
G7 CCOROS 136 0.898 0.894 0.136 0.805 —1.168 2.104 34.3 
G8 CCOSPP 162 0.971 0.970 0.098 0.776 -1.432 1.654 18.1 


Table 2. Statistical properties of the fitted functions (where CC is the crown cover, OR Quercus rotundifolia, QS Querus 
suber, PP Pinus pinea and d the dummy variable). 


Model a Model a 

N3 1.68195 G3 0.22776 
N4 1.7749 G4 0.16797 
N5 1.7041 G5 0.2436 
N6 1.5324 G6 0.21361 
N7 1.9839 G7 0.21583 
N8 1.65806 G8 0.255849 


Table 3. Regression coefficients of the N and G functions. 
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Figure 5. Studentized residuals (a) and normal probability (b) graphics for N1, N2, and N3. 


N1 = 0.8982 x CC + 37.7851 x dop + 28.3933 x d,, + 35.4451 x d, 


+ 35.4373 x doros + 33.1026 x dospp 


P 


(6) 


N2 = 1.7334 x CC, + 2.08536 x CC,, + 1.50642 x CC,, (7) 
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Figure 6. Studentized residuals (a) and normal probability (b) graphics for G1, G2, and G3. 


G1 = 0.14992 x CC + 1.04399 x dop + 3.84375 x d; + 3.39116 x d,, 


+ 2.25474 x d rgs + 4.37992 x di. 


(8) 


G2 = 0.19954 x CC or + 0.2234 x CC, + 0.237722 x CC, (9) 
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The coefficient of determination of the functions to estimate the number of trees per hectare 
with explanatory variables derived from satellite data is rather variable from 0.38 to 0.88 
[26, 29, 30, 32]; an even wider variation is found for the functions to estimate the basal area 
per hectare, from 0.07 to 0.92 [26, 29, 30, 32, 35, 40]. In general, the functions for N and G 
estimation developed in this study have higher R? than those of the aforementioned studies. 
RMSEr decreases from N3 and G3 to N1 and G1 (Table 2) and is smaller than that attained 
value for the basal area as studied by reference Hyvonen et al. [36] (46-53%) and Günlü et al. 
[40] (29-34%). 


For N1, N2, G1 and C2, no serious multicoliniarity is expected as VIF for all explanatory 
variables is smaller than 7.7. The studentized residuals of N1, N2, N3, G1, G2 and G3 do 
not show systematic variations; the normal probability graphs approach the straight line 
(Figures 5 and 6), thought with deviations in the extremes, which is corroborated by the 
residuals not meeting normality (all, p < 0.001). Similar results were attained for N3-N8 and 
G4-G8. 


4. Conclusions 


The vegetation mask per species from very high-resolution satellite images enables deriving 
accurately crown cover. In spite of the similarities between holm oak, cork oak and umbrella 
pine, the methods of multi-segmentation and object-oriented classification allow their distinc- 
tion with accuracy. 


The functions with the best performances for the number of trees per hectare and basal area 
per hectare for the three species were attained with the explanatory variable crown cover and 
composition classes as dummy variables. 


The agroforestry systems in general, and those of holm oak, cork oak and umbrella pine in 
particular, have high spatial variability which makes these functions well suited to evaluate 
and monitor their stands in space and time. 
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